Compositions of Tree-to-Tree Statistical Machine Translation Models
Abstract
Compositions of well-known tree-to-tree translation models used in statistical machine translation are investigated. Synchronous context-free grammars are closed under composition in both the unweighted and the weighted case. In addition, it is demonstrated that there is a close connection between compositions of synchronous tree-substitution grammars and compositions of certain tree transducers because the intermediate trees can encode finite-state information. Utilizing these close ties, the composition closure of synchronous tree-substitution grammars is identified in both the unweighted and the weighted case. In particular, in the weighted case, these results build on a novel lifting strategy that will also prove useful in other setups.
Similar resources
Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers
In this paper I present a Master’s thesis proposal in syntax-based Statistical Machine Translation. I propose to build discriminative SMT models using both tree-to-string and tree-to-tree approaches. Translation and language models will be represented mainly through the use of Tree Automata and Tree Transducers. These formalisms have important representational properties that make them well-su...
Zoning of flood hazard in Nowshahr city using machine learning models
The aim of this study is to predict and model flood hazard in the city of Nowshahr, Mazandaran province using machine learning models. The criteria and indicators affecting flood hazard were identified based on the review of resources, and then the indicators were converted into rasters in ArcGIS environment, and finally standardized by fuzzy method for use in the models. K-nearest neighbor ...
Statistical Machine Translation Part II: Tree-Based SMT
One of the most active and promising areas of statistical machine translation (SMT) research is tree-based SMT. Tree-based SMT has the potential to overcome the weaknesses of early SMT architectures, which (a) do not handle long-distance dependencies well, and (b) are underconstrained in that they allow too much flexibility in word reordering. In this tutorial, we will review the var...
Forest-based Tree Sequence to String Translation Model
This paper proposes a forest-based tree sequence to string translation model for syntax-based statistical machine translation, which automatically learns tree sequence to string translation rules from word-aligned, source-side-parsed bilingual texts. The proposed model leverages the strengths of both tree sequence-based and forest-based translation models. Therefore, it can not only utilize for...
Sub-Sentence Division for Tree-Based Machine Translation
Tree-based statistical machine translation models have made significant progress in recent years, especially when replacing 1-best trees with packed forests. However, as parsing accuracy usually drops dramatically as sentence length increases, translating long sentences often takes a long time and produces only degenerate translations. We propose a new method named subsentence div...